Partitioned Schedules for Clustered VLIW Architectures

Authors

  • Marcio Merino Fernandes
  • Josep Llosa
  • Nigel P. Topham
Abstract

This paper presents results on a new approach to partitioning a modulo-scheduled loop for distributed execution on parallel clusters of functional units organized as a VLIW machine. A distinctive characteristic of this architecture is the use of register files organized by means of queues, which results in a number of advantages over conventional schemes, but also requires the development of specific compiling and hardware features. We have investigated a scheme based on copy operations to deal with data values to be consumed more than once during loop execution. Experiments with loop unrolling were also performed in order to optimize both loop execution and the use of machine resources. A partitioning algorithm has been implemented to perform some experiments with the clustered architecture model, an organization widely accepted as being essential for very wide issue machines.

Similar resources

Unified Cluster Assignment and Instruction Scheduling for Clustered VLIW Microarchitectures

There has been a trend towards microarchitectures that have disjoint register files to reduce the register file access time. The register file is partitioned and a set of functional units is assigned to each partitioned register file. The partitioned register file and its set of functional units constitute a cluster. Instruction scheduling for a clustered microprocessor requires assignment and scheduling...

Compiler-assisted power optimization for clustered VLIW architectures

Clustered VLIW architectures solve the scalability problem associated with flat VLIW architectures by partitioning the register file and connecting only a subset of the functional units to a register file. However, inter-cluster communication in clustered architectures leads to increased leakage in functional components and a high number of register accesses. In this paper, we propose compiler ...

Exploring Energy-Performance Trade-Offs for Heterogeneous Interconnect Clustered VLIW Processors

Clustered architecture processors are preferred for embedded systems because centralized register file architectures scale poorly in terms of clock rate, chip area, and power consumption. Although clustering helps by improving clock speed, reducing energy consumption of the logic, and making design simpler, it introduces extra overheads by way of inter-cluster communication. This communication ...

Stream Execution on Embedded Wide-Issue Clustered VLIW Architectures

Very long instruction word (VLIW)-based processors have become widely adopted as a basic building block in modern System-on-Chip designs. Advances in clustered VLIW architectures have extended the scalability of the VLIW architecture paradigm to a large number of functional units and very-wide-issue widths. A central challenge with wide-issue clustered VLIW architecture is the availability of pr...

Pragmatic integrated scheduling for clustered VLIW architectures

Clustered architecture processors are preferred for embedded systems because centralized register file architectures scale poorly in terms of clock rate, chip area, and power consumption. Scheduling for clustered architectures involves spatial concerns (where to schedule) as well as temporal concerns (when to schedule). Various clustered VLIW configurations, connectivity types, and inter-cluste...

Publication date: 1998